update eval-driven-dev skill by yiouli · Pull Request #1434 · github/awesome-copilot

yiouli · 2026-04-17T22:29:46Z

Pull Request Checklist

I have read and followed the CONTRIBUTING.md guidelines.
I have read and followed the Guidance for submissions involving paid services.
My contribution adds a new instruction, prompt, agent, skill, or workflow file in the correct directory.
The file follows the required naming convention.
The content is clearly structured and follows the example format.
I have tested my instructions, prompt, agent, skill, or workflow with GitHub Copilot.
I have run npm start and verified that README.md is up to date.
I am targeting the staged branch for this pull request.

Description

Update the eval-driven-dev skill: add a comprehensive analysis step after evaluation runs.

Type of Contribution

Additional Notes

By submitting this pull request, I confirm that my contribution abides by the Code of Conduct and will be licensed under the MIT License.

github-actions · 2026-04-17T22:30:15Z

🔍 Skill Validator Results

⚠️ Warnings or advisories found

Scope	Checked
Skills	1
Agents	1
Total	2

Severity	Count
❌ Errors	0
⚠️ Warnings	2
ℹ️ Advisories	0

Summary

Level	Finding
ℹ️	Found 1 skill(s)
ℹ️	[eval-driven-dev] 📊 eval-driven-dev: 3,768 BPE tokens [chars/4: 4,311] (standard ~), 16 sections, 1 code blocks
ℹ️	[eval-driven-dev] ⚠ Skill is 3,768 BPE tokens (chars/4 estimate: 4,311) — approaching "comprehensive" range where gains diminish.
ℹ️	[eval-driven-dev] ⚠ No numbered workflow steps — agents follow sequenced procedures more reliably.
ℹ️	✅ All checks passed (1 skill(s))

Full validator output

```text
Found 1 skill(s)
[eval-driven-dev] 📊 eval-driven-dev: 3,768 BPE tokens [chars/4: 4,311] (standard ~), 16 sections, 1 code blocks
[eval-driven-dev] ⚠ Skill is 3,768 BPE tokens (chars/4 estimate: 4,311) — approaching "comprehensive" range where gains diminish.
[eval-driven-dev] ⚠ No numbered workflow steps — agents follow sequenced procedures more reliably.
✅ All checks passed (1 skill(s))
```
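The validator reports two size figures for the skill: a BPE token count and a "chars/4" estimate. As a minimal sketch of what the second figure likely means, the Python below assumes the heuristic is simply the character count divided by four with integer division. The validator's actual rounding and tokenizer are not shown in this thread, so the function names and the rounding choice here are hypothetical.

```python
def estimate_tokens_by_chars(text: str) -> int:
    """Rough token estimate: one token per four characters.

    Assumption: integer division; the validator may round differently.
    """
    return len(text) // 4


def estimate_ratio(char_estimate: int, bpe_tokens: int) -> float:
    """Ratio of the chars/4 estimate to a measured BPE token count."""
    return char_estimate / bpe_tokens


# Figures copied from the validator output for eval-driven-dev.
REPORTED_BPE_TOKENS = 3768
REPORTED_CHARS_DIV_4 = 4311

if __name__ == "__main__":
    # Hypothetical input: a string whose length gives the reported estimate.
    sample = "x" * 17_244
    print(estimate_tokens_by_chars(sample))
    print(round(estimate_ratio(REPORTED_CHARS_DIV_4, REPORTED_BPE_TOKENS), 3))
```

For this skill the character-based estimate comes out higher than the BPE count, so the heuristic overestimates here.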

Copilot

Pull request overview

Updates the eval-driven-dev skill to align with newer pixie-qa concepts (notably input_data, agent evaluators, and structured post-run analysis) and expands the skill’s references to include a dedicated Step 6 analysis workflow and runnable implementation examples.

Changes:

Updates the skill metadata and setup workflow to target pixie-qa >=0.8.1,<0.9.0 and revises setup/error-handling guidance.
Refactors the skill’s step-by-step reference docs (new Step 1a project analysis, split Step 2, new Step 6 “Analyze Outcomes”, removal of older combined/iteration docs).
Adds runnable examples (standalone function, FastAPI, CLI) and updates API reference docs to reflect the newer dataset shapes (input_data etc.).
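The version pin `>=0.8.1,<0.9.0` mentioned above accepts any 0.8.x release from 0.8.1 onward and rejects 0.9.0 and later. A minimal sketch of that check in plain Python follows. The helper names are hypothetical, it handles only plain dotted numeric versions (no pre-release or local suffixes), and it is not how `setup.sh` or pip evaluate the constraint.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Parse a dotted numeric version, padding to three components.

    Padding makes "0.9" compare equal to "0.9.0" instead of sorting below it.
    """
    parts = tuple(int(part) for part in version.split("."))
    return parts + (0,) * max(0, 3 - len(parts))


# Bounds taken from the constraint quoted in the review: >=0.8.1,<0.9.0
LOWER_INCLUSIVE = parse_version("0.8.1")
UPPER_EXCLUSIVE = parse_version("0.9.0")


def in_supported_range(version: str) -> bool:
    """Return True if version satisfies >=0.8.1,<0.9.0."""
    return LOWER_INCLUSIVE <= parse_version(version) < UPPER_EXCLUSIVE


if __name__ == "__main__":
    for candidate in ("0.8.0", "0.8.1", "0.8.12", "0.9", "0.9.0"):
        print(candidate, in_supported_range(candidate))
```

Comparing integer tuples rather than strings avoids the common mistake of ranking "0.8.12" below "0.8.2".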

Reviewed changes

Copilot reviewed 22 out of 22 changed files in this pull request and generated 12 comments.

File	Description
skills/eval-driven-dev/resources/setup.sh	Updates install/upgrade logic and adds stricter failure handling for pixie init/start.
skills/eval-driven-dev/references/wrap-api.md	Updates wrap API reference and CLI wording (including dataset field naming).
skills/eval-driven-dev/references/testing-api.md	Updates testing API reference to match new dataset schema and runner behavior.
skills/eval-driven-dev/references/evaluators.md	Adds `create_agent_evaluator` reference and updates evaluator selection guidance.
skills/eval-driven-dev/references/6-investigate.md	Removes prior Step 6 “investigate/iterate” reference.
skills/eval-driven-dev/references/6-analyze-outcomes.md	Adds new structured, multi-phase Step 6 analysis workflow and required outputs.
skills/eval-driven-dev/references/5-run-tests.md	Reframes Step 5 as “run tests and fix mechanical issues” and updates commands/content.
skills/eval-driven-dev/references/4-build-dataset.md	Updates dataset schema (`input_data`), adds realism audits, and expands capture guidance.
skills/eval-driven-dev/references/3-define-evaluators.md	Shifts evaluator strategy toward agent evaluators and updates mapping guidance.
skills/eval-driven-dev/references/2c-capture-and-verify-trace.md	Adds a dedicated sub-step doc for trace capture and verification.
skills/eval-driven-dev/references/2b-implement-runnable.md	Adds a dedicated sub-step doc for Runnable implementation and placement.
skills/eval-driven-dev/references/2a-instrumentation.md	Adds a dedicated sub-step doc for `wrap()` instrumentation practices.
skills/eval-driven-dev/references/2-wrap-and-trace.md	Removes older combined Step 2 reference.
skills/eval-driven-dev/references/1-c-eval-criteria.md	Adds updated eval criteria guidance tied to project analysis and failure modes.
skills/eval-driven-dev/references/1-b-eval-criteria.md	Removes older Step 1b eval criteria reference.
skills/eval-driven-dev/references/1-b-entry-point.md	Renumbers/updates entry-point documentation and emphasizes capability prioritization.
skills/eval-driven-dev/references/1-a-project-analysis.md	Adds new Step 1a project analysis reference and required outputs.
skills/eval-driven-dev/references/runnable-examples/standalone-function.md	Adds runnable example for direct function invocation.
skills/eval-driven-dev/references/runnable-examples/fastapi-web-server.md	Adds runnable example for FastAPI/ASGI in-process evaluation.
skills/eval-driven-dev/references/runnable-examples/cli-app.md	Adds runnable example for CLI subprocess execution.
skills/eval-driven-dev/SKILL.md	Updates skill description/versioning and rewrites the step flow to include analysis.
docs/README.skills.md	Updates the skills index entry for `eval-driven-dev` to match the new references list.

update eval-driven-dev skill

03a5f86

Copilot AI review requested due to automatic review settings April 17, 2026 22:29

yiouli requested a review from aaronpowell as a code owner April 17, 2026 22:29

Copilot started reviewing on behalf of yiouli April 17, 2026 22:30

Copilot AI reviewed Apr 17, 2026

yiouli added 2 commits April 17, 2026 15:40

fix: update skill update command to use correct repository path

32f91ac

address comments.

fca9a9c

yiouli wants to merge 3 commits into github:staged from yiouli:staged

2 participants
